In automatic text summarization, preprocessing is an important phase for reducing the dimensionality of the textual representation. Classically, stemming and lemmatization have been widely used to normalize words. However, even when normalization is applied to large texts, the curse of dimensionality can degrade the performance of summarizers. This paper describes a new word normalization method that further reduces the representation space. We propose reducing each word to its initial letters, as a form of Ultra-stemming. The results show that Ultra-stemming not only preserves the content of the summaries produced with this representation, but that system performance can often be dramatically improved. Summaries on trilingual corpora were evaluated automatically with Fresa. The results confirm an increase in performance, regardless of the summarizer system used.
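
As a rough illustration of the idea of reducing each word to its initial letters, the following is a minimal sketch and not the authors' implementation. The function name `ultra_stem`, the parameter `n` (number of initial letters kept), and the regex-based tokenizer are all assumptions made for this example; the abstract does not specify how many letters are retained or how text is tokenized.

```python
import re


def ultra_stem(text, n=1):
    """Truncate every word to its first n letters (hypothetical helper).

    Tokenization here is an assumption: lowercase the text and keep
    runs of letters, dropping digits and punctuation.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [token[:n] for token in tokens]


sentence = "Summarizers summarize long documents"
words = re.findall(r"[^\W\d_]+", sentence.lower())
stems = ultra_stem(sentence, n=1)

print(stems)
# Compare vocabulary sizes before and after truncation.
print(len(set(words)), len(set(stems)))
```

In this toy sentence, "summarizers" and "summarize" collapse onto the same one-letter key, which is the kind of reduction in representation space the abstract describes. Larger values of `n` would merge fewer words.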